Improving Surgical Training Phantoms by Hyperrealism: Deep Unpaired Image-to-Image Translation from Real Surgeries
Current 'dry lab' surgical phantom simulators are a valuable tool for
surgeons, allowing them to improve their dexterity and skill with surgical
instruments. These phantoms mimic the haptics and shape of the organs of
interest, but lack a realistic visual appearance. In this work, we present an innovative
application in which representations learned from real intraoperative
endoscopic sequences are transferred to a surgical phantom scenario. The term
hyperrealism is introduced in this field, which we regard as a novel subform of
surgical augmented reality for approaches that involve real-time object
transfigurations. For related tasks in the computer vision community, unpaired
cycle-consistent Generative Adversarial Networks (GANs) have shown excellent
results on still RGB images. However, applying this approach to continuous
video frames can result in flickering, which turned out to be especially
prominent in this application. We therefore propose an extension of
cycle-consistent GANs, named tempCycleGAN, to improve temporal consistency. The
novel method is evaluated on captures of a silicone phantom for training
endoscopic reconstructive mitral valve procedures. Synthesized videos show
highly realistic results with regard to 1) replacement of the silicone
appearance of the phantom valve by intraoperative tissue texture, while 2)
explicitly keeping crucial features in the scene, such as instruments, sutures
and prostheses. Compared to the original CycleGAN approach, tempCycleGAN
efficiently removes flickering between frames. The overall approach is expected
to change the future design of surgical training simulators since the generated
sequences clearly demonstrate the feasibility of a considerably more
realistic training experience for minimally invasive procedures.
Comment: 8 pages, accepted at MICCAI 2018; supplemental material at
https://youtu.be/qugAYpK-Z4
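The abstract combines two ideas: cycle consistency (the core of CycleGAN) and a temporal-consistency term. As a rough, hypothetical illustration only, the sketch below shows an L1 cycle-consistency loss and a naive frame-to-frame smoothness penalty; the function names and the penalty are invented simplifications, not the authors' actual tempCycleGAN losses (which must avoid suppressing genuine motion).

```python
import numpy as np

def cycle_consistency_loss(x, G, F):
    """L1 cycle loss ||F(G(x)) - x||_1, averaged over elements.
    G maps domain A -> B (e.g. phantom -> intraoperative look),
    F maps back B -> A."""
    return float(np.mean(np.abs(F(G(x)) - x)))

def temporal_consistency_loss(frames, G):
    """Penalise frame-to-frame changes in the translated sequence.
    A deliberately crude stand-in for a temporal-consistency term;
    a real method must distinguish flicker from true scene motion."""
    translated = [G(f) for f in frames]
    diffs = [np.mean(np.abs(b - a)) for a, b in zip(translated, translated[1:])]
    return float(np.mean(diffs))

# Toy check: identity generators make the cycle loss vanish exactly.
G = lambda x: x
F = lambda x: x
frames = [np.random.rand(4, 4) for _ in range(3)]
print(cycle_consistency_loss(frames[0], G, F))  # 0.0
print(temporal_consistency_loss(frames, G))
```

In practice both terms would be added (with weights) to the adversarial losses of the two GANs and minimised jointly.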
Surgical Phase and Instrument Recognition: How to identify appropriate Dataset Splits
Purpose: The development of machine learning models for surgical workflow and
instrument recognition from temporal data represents a challenging task due to
the complex nature of surgical workflows. In particular, the imbalanced
distribution of data is one of the major challenges in the domain of surgical
workflow recognition. In order to obtain meaningful results, careful
partitioning of the data into training, validation, and test sets and the
selection of suitable evaluation metrics are crucial. Methods: In this work, we
present an openly available web-based application that enables interactive
exploration of dataset partitions. The proposed visual framework facilitates
the assessment of dataset splits for surgical workflow recognition, especially
with regard to identifying sub-optimal dataset splits. Currently, it supports
visualization of surgical phase and instrument annotations. Results: In order
to validate the dedicated interactive visualizations, we use a dataset split of
the Cholec80 dataset. This dataset split was specifically selected to reflect a
case of strong data imbalance. Using our software, we were able to identify
phases, phase transitions, and combinations of surgical instruments that were
not represented in one of the sets. Conclusion: In order to obtain meaningful
results with highly imbalanced class distributions, special care should be
taken in the selection of an appropriate split. Interactive data
visualization represents a promising approach for the assessment of machine
learning datasets. The source code is available at
https://github.com/Cardio-AI/endovis-ml
Comment: Accepted at the 14th International Conference on Information
Processing in Computer-Assisted Interventions (IPCAI 2023); 9 pages, 4
figures, 1 table
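As an illustration of the kind of check the described tool supports, the hypothetical snippet below flags labels (e.g. surgical phases) that never occur in a given split. The helper name and the example data are invented for illustration; they are not taken from the authors' application, and the phase names merely echo Cholec80-style annotations.

```python
def split_coverage(splits):
    """For each split, list the labels that occur somewhere in the
    dataset but are missing from that split -- the kind of gap a
    sub-optimal train/val/test partition introduces."""
    all_labels = set().union(*(set(labels) for labels in splits.values()))
    return {name: sorted(all_labels - set(labels))
            for name, labels in splits.items()}

# Hypothetical flattened phase annotations per split.
splits = {
    "train": ["Preparation", "CalotTriangleDissection", "ClippingCutting"],
    "val":   ["Preparation", "ClippingCutting"],
    "test":  ["Preparation"],
}
print(split_coverage(splits))
# The 'test' split here lacks two phases entirely, so phase-level
# metrics computed on it would be undefined or misleading.
```

The same idea extends to instrument combinations or phase transitions by replacing the label lists with tuples.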
mvHOTA: A multi-view higher order tracking accuracy metric to measure spatial and temporal associations in multi-point detection
Multi-point tracking is a challenging task that involves detecting points in
the scene and tracking them across a sequence of frames. Computing
detection-based measures like the F-measure on a frame-by-frame basis is not
sufficient to assess the overall performance, as it does not capture
performance in the temporal domain. The main available evaluation metrics come
from multi-object tracking (MOT), where performance on datasets such as KITTI
is benchmarked with the recently proposed higher order tracking accuracy
(HOTA) metric, which describes performance better than metrics such as MOTA,
DetA, and IDF1. While the HOTA metric takes into
account temporal associations, it does not provide a tailored means to analyse
the spatial associations of a dataset in a multi-camera setup. Moreover, there
are differences in evaluating the detection task for points when compared to
objects (point distances vs. bounding box overlap). Therefore, in this work,
we propose a multi-view higher order tracking metric (mvHOTA) to determine the
accuracy of multi-point (multi-instance and multi-class) tracking methods,
while taking into account temporal and spatial associations. mvHOTA can be
interpreted as the geometric mean of detection, temporal, and spatial
associations, thereby providing equal weighting to each of the factors. We
demonstrate the use of this metric to evaluate the tracking performance on an
endoscopic point detection dataset from a previously organised surgical data
science challenge. Furthermore, we compare with other adjusted MOT metrics for
this use-case, discuss the properties of mvHOTA, and show how the proposed
multi-view association and the occlusion index (OI) facilitate the analysis
of methods with respect to the handling of occlusions. The code is available
at https://github.com/Cardio-AI/mvhota.
Comment: 16 pages, 9 figures
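The abstract describes mvHOTA as the geometric mean of detection, temporal-association, and spatial-association scores. A minimal sketch of that combination rule follows, assuming the three sub-scores are precomputed values in [0, 1]; the sub-scores' exact definitions in the paper differ, so this only illustrates the equal weighting.

```python
def mv_hota(det_a, temp_a, spat_a):
    """Geometric mean of detection accuracy, temporal-association
    accuracy, and spatial-association accuracy. Each factor is
    assumed to lie in [0, 1], so the result does too, and a weak
    factor cannot be hidden by strong ones."""
    return (det_a * temp_a * spat_a) ** (1.0 / 3.0)

print(mv_hota(0.9, 0.8, 0.7))  # ~0.80: pulled towards the weakest factor
```

Because the geometric mean is zero whenever any factor is zero, a method that detects points perfectly but never associates them across views still scores 0.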
Self-supervised motion descriptor for cardiac phase detection in 4D CMR based on discrete vector field estimations
Cardiac magnetic resonance (CMR) sequences visualise the cardiac function
voxel-wise over time. Simultaneously, deep learning-based deformable image
registration is able to estimate discrete vector fields which warp one time
step of a CMR sequence to the next in a self-supervised manner. However,
despite the rich source of information included in these 3D+t vector fields, a
standardised interpretation is challenging and the clinical applications remain
limited so far. In this work, we show how to efficiently use a deformable
vector field to describe the underlying dynamic process of a cardiac cycle in
the form of a derived 1D motion descriptor. Additionally, based on the expected
cardiovascular physiological properties of a contracting or relaxing ventricle,
we define a set of rules that enables the identification of five cardiovascular
phases including the end-systole (ES) and end-diastole (ED) without the usage
of labels. We evaluate the plausibility of the motion descriptor on two
challenging multi-disease, multi-center, multi-scanner short-axis CMR
datasets: first, by reporting quantitative measures such as the periodic frame
difference for the extracted phases; second, by qualitatively comparing the
general pattern when we temporally resample and align the motion descriptors
of all instances across both datasets. The average periodic frame difference
for the ED and ES key phases
of our approach is , which is slightly better
than the inter-observer variability (, ) and the
supervised baseline method (, ). Code and labels
will be made available on our GitHub repository:
https://github.com/Cardio-AI/cmr-phase-detection
Comment: accepted for the STACOM2022 workshop @ MICCAI2022
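To make the idea of a derived 1D motion descriptor concrete, here is a hypothetical sketch that collapses a dense 3D+t displacement field into one value per time step and picks the curve's extrema as candidate key frames. The function names and the extrema rule are invented simplifications; the paper's actual descriptor and its rule set over five cardiac phases are more involved.

```python
import numpy as np

def motion_descriptor(flow):
    """Collapse a 3D+t displacement field of shape (t, z, y, x, 3)
    into a 1D curve: the mean displacement magnitude per time step."""
    return np.linalg.norm(flow, axis=-1).mean(axis=(1, 2, 3))

def key_phases(descriptor):
    """Pick candidate key frames as the extrema of the curve --
    a crude stand-in for physiology-based phase rules."""
    return int(np.argmax(descriptor)), int(np.argmin(descriptor))

# Synthetic cycle: displacement magnitude rises and falls over 20 frames.
t = np.linspace(0.0, 2.0 * np.pi, 20)
flow = np.zeros((20, 4, 8, 8, 3))
flow[..., 0] = np.sin(t)[:, None, None, None]
desc = motion_descriptor(flow)
print(key_phases(desc))
```

A label-free descriptor like this is what makes rule-based phase detection self-supervised: the rules operate on the curve's shape rather than on annotated ES/ED frames.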